Two important submission CSVs were written incorrectly, but in anticipation of this problem we pickled the raw results. Opening them now.
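For context, a minimal sketch of the safeguard (hypothetical code, not the actual test.py): keep the per-batch prediction arrays in a list and pickle that list before any CSV writing, so the raw results survive if the write step fails.

```python
import pickle
import numpy as np

# Hypothetical safeguard: accumulate one array of class probabilities
# per batch, then pickle the whole list before writing the submission.
batch_predictions = [np.random.rand(1630, 121) for _ in range(80)]

with open("test.py.pkl", "wb") as f:
    pickle.dump(batch_predictions, f)

# recover them exactly as the session below does
with open("test.py.pkl", "rb") as f:
    recovered = pickle.load(f)

print(len(recovered), recovered[0].shape)  # 80 (1630, 121)
```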


In [1]:
import pickle

In [2]:
cd /disk/scratch/neuroglycerin/dump/


/disk/scratch/neuroglycerin/dump

In [3]:
ls


test2.py.pkl  test.py.pkl

In [4]:
with open("test.py.pkl","rb") as f:
    p = pickle.load(f)

In [5]:
len(p)


Out[5]:
80

In [7]:
p[0].shape[0]*80


Out[7]:
130400

Looks like everything should be there; now we just have to figure out why it didn't write these to the CSV correctly. The next step was to stack the arrays:


In [8]:
import numpy as np

In [9]:
y = np.vstack(p)

In [10]:
y.shape


Out[10]:
(130400, 121)

That worked. What about finding the image names for the CSV?


In [11]:
import neukrill_net.utils

In [13]:
cd ~/repos/neukrill-net-work/


/afs/inf.ed.ac.uk/user/s08/s0805516/repos/neukrill-net-work

In [14]:
settings = neukrill_net.utils.Settings("settings.json")

In [17]:
import os

In [18]:
names = [os.path.basename(n) for n in settings.image_fnames['test']]

In [19]:
len(names)


Out[19]:
130400

That also seems to be fine...

The only explanation I can think of at this point is that it somehow redefined the image_fnames dict to cover only one of the splits. But that makes no sense, because the image_fnames dictionary that gets modified is a different instance from the one in the test.py script.
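As an illustration of the suspected failure mode (hypothetical names, not the actual pipeline code): if image_fnames["test"] were rebound to a single split before the CSV was written, the writer would only see one split's worth of entries.

```python
# Hypothetical illustration of the suspected bug: rebinding the test
# entry to one of 80 splits would leave only 1630 names for the writer.
image_fnames = {"test": ["img_{}.jpg".format(i) for i in range(130400)]}

# suppose some split-generating code did this before the CSV write:
splits = [image_fnames["test"][i::80] for i in range(80)]
image_fnames["test"] = splits[0]

print(len(image_fnames["test"]))  # 1630 -- would explain the short CSV
```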

Looking at the submission csvs:


In [22]:
cd /disk/scratch/neuroglycerin/submissions/


/disk/scratch/neuroglycerin/submissions

In [23]:
ls


alexnet_based_16aug.csv.gz
alexnet_based_40aug_backup.csv.gz
alexnet_based_40aug.csv.gz
alexnet_based_40aug_prior_weighted.csv.gz
alexnet_based.backup.csv*
alexnet_based.csv
alexnet_based.csv.gz.backup*
alexnet_based_extra_convlayer.csv.gz
alexnet_based_fixed.csv.gz
alexnet_based_norm_global_8aug.csv.gz
alexnet_based_norm_global.csv.gz
alexnet_based_norm_pixel.csv.gz
alexnet_based_objective.csv.gz
alexnet_based_objective.csv.gz.backup
combine_40aug_class_predictions.csv.gz
combine_disparate_models.csv.gz
combine_more_disparate_models.csv.gz
fewer_conv_channels_with_dropout_resume.csv.gz
parallel_conv.csv.gz
replicate_8aug.csv.gz
superclasses_online.csv.gz
testing_combiner.csv.gz

In [24]:
!gzip -d alexnet_based_40aug.csv.gz

In [26]:
!wc -l alexnet_based_40aug.csv


1631 alexnet_based_40aug.csv

Each split would have been the full test set divided by 80:


In [27]:
130400/80


Out[27]:
1630

Including the header line, that accounts for the 1631 lines exactly: the CSV contains only a single split.
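The counts can be checked directly:

```python
total_rows = 130400       # rows stacked from the pickle
n_splits = 80
rows_per_split = total_rows // n_splits

print(rows_per_split)     # 1630
print(rows_per_split + 1) # 1631, matching wc -l once the header is counted
```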

All we can do now is rewrite the submission CSV with the full set of names and submit it to check it's valid.


In [34]:
neukrill_net.utils.write_predictions("alexnet_based_40aug.csv",y,names,settings.classes)
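For reference, a minimal sketch of what a write_predictions-style helper might do (an assumption; the real neukrill_net.utils implementation may differ): one image-name column plus one probability column per class, gzipping when the filename ends in .gz.

```python
import csv
import gzip


def write_predictions_sketch(fname, probabilities, image_names, classes):
    """Hypothetical stand-in for neukrill_net.utils.write_predictions:
    write a Kaggle-style submission CSV with a header of class names."""
    opener = gzip.open if fname.endswith(".gz") else open
    with opener(fname, "wt", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["image"] + list(classes))
        for name, row in zip(image_names, probabilities):
            writer.writerow([name] + list(row))
```

Dispatching on the extension means the same call works for both the plain `alexnet_based_40aug.csv` here and the gzipped `alexnet_based_16aug.csv.gz` below.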

And we have to do the same for the 16aug predictions.


In [35]:
cd /disk/scratch/neuroglycerin/dump/


/disk/scratch/neuroglycerin/dump

In [36]:
with open("test2.py.pkl","rb") as f:
    p16aug = pickle.load(f)

In [37]:
y16aug = np.vstack(p16aug)
y16aug.shape


Out[37]:
(130400, 121)

In [38]:
cd /disk/scratch/neuroglycerin/submissions/


/disk/scratch/neuroglycerin/submissions

In [39]:
neukrill_net.utils.write_predictions("alexnet_based_16aug.csv.gz",y16aug,names,settings.classes)